Neighbourhoods of examples for detecting logical redundancy
Abstract
In previous work we proposed an algorithm, REFER, for removing logically redundant features from a dataset of Boolean examples, each labelled with one of any number of possible class labels. We define redundant features as those which can be removed without compromising the learning of a classification rule. Such a redundant feature is said to be covered by another feature present in the dataset. Disjoint subsets of examples of the same class, called neighbourhoods, are used both to permit feature reduction in multi-class problems and to enable more features to be removed. A further benefit of using neighbourhoods is that a redundant feature can be detected because it is covered by a combination of features rather than by a single feature. In this paper we review the REFER method, demonstrate how this effect comes about, and discuss adaptations which take advantage of it.

1 Feature reduction for classification

Classification is one of the fundamental tasks in machine learning and has been extensively studied. In the usual classification setting, the input or training data consists of multiple examples, each having multiple attributes or features. Each example is tagged with a class label. The goal is to learn the target concept (in the context of this paper, a rule) associated with each class by finding regularities in the examples of a class that characterise that class and discriminate it from the other classes.

Propositionalisation [4, 5] is an approach to knowledge discovery and data mining tasks in multirelational databases. In propositionalisation, a set of clauses is generated according to the relational structure, and a propositional dataset is generated in which the truth values of the clauses form the features. This transformed dataset can be used as input for conventional attribute-value learning systems.
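As a rough illustration of how truth values of clauses become Boolean features, the following minimal sketch evaluates a few hand-written clauses against a toy relational dataset. The data, the clause names, and the helper `propositionalise` are all hypothetical and chosen only for illustration; in a real propositionalisation system the clauses would be generated from the relational structure rather than written by hand.

```python
from typing import Any, Callable, Dict, List, Tuple

Example = Dict[str, Any]
Clause = Callable[[Example], bool]


def propositionalise(
    examples: List[Example], clauses: Dict[str, Clause]
) -> Tuple[List[str], List[List[bool]]]:
    """Evaluate every clause on every example.

    Returns the feature names and a Boolean table with one row per
    example and one column per clause (hypothetical helper).
    """
    names = list(clauses)
    table = [[bool(clauses[name](ex)) for name in names] for ex in examples]
    return names, table


# Toy relational data (assumed): each customer owns a list of orders.
customers = [
    {"id": 1, "orders": [{"item": "book", "price": 12},
                         {"item": "pen", "price": 2}]},
    {"id": 2, "orders": []},
    {"id": 3, "orders": [{"item": "laptop", "price": 900}]},
]

# Hand-written clauses standing in for automatically generated ones.
clauses = {
    "has_order": lambda c: len(c["orders"]) > 0,
    "has_order_over_100": lambda c: any(o["price"] > 100 for o in c["orders"]),
    "has_book_order": lambda c: any(o["item"] == "book" for o in c["orders"]),
}

names, table = propositionalise(customers, clauses)
for customer, row in zip(customers, table):
    print(customer["id"], dict(zip(names, row)))
```

Each row of `table` is one Boolean example of the kind an attribute-value learner expects. Note that in this toy data `has_order_over_100` and `has_book_order` can only be true when `has_order` is true, a small instance of the logical dependence between generated features.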
The process of propositionalisation tends to produce large numbers of highly correlated features, or features which are logically redundant in the presence of others. Redundant features may be defined as those features in a dataset which contribute nothing to the process of concept learning. In the simple case where a feature is always false (or always true), such redundancy can be found during a preprocessing stage, but in the general case it can only be found through deeper analysis of the data. The problem of feature selection has been tackled extensively in machine learning, usually because it is believed that removing poor features (by some quality measure) will lead to better classification performance [6, 8] and more comprehensible models [3, 2]. We draw a distinction between most established feature selection methods, which